Health Data Science Datathon 2024

Can you predict the next pandemic?

Overview

This repository contains the subset of the EPIWATCH data used for the datathon, along with some details on the dataset.


Accessing the data

Read the CSV file into R with readr::read_csv().
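
A minimal sketch of loading the data in R, assuming the readr package (version 2.0 or later) is installed. The file name `epiwatch_subset.csv` and the column names are placeholders, not the repository's actual names, and the two rows are written to a temporary file only so the snippet runs on its own.

```r
library(readr)

# Hypothetical stand-in for the datathon file: substitute the real CSV path.
csv_path <- file.path(tempdir(), "epiwatch_subset.csv")
writeLines(
  c("country,disease",
    "Australia,Measles",
    "Brazil,Dengue"),
  csv_path
)

# read_csv() returns a tibble; show_col_types = FALSE silences the
# column-specification message.
reports <- read_csv(csv_path, show_col_types = FALSE)

# Inspect the columns and their guessed types before any analysis.
str(reports)
```

Checking the guessed column types first is worthwhile because dates and coordinates are often read in as plain text.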

Country

Data for twelve countries:

  • United States
  • India
  • China
  • Russian Federation
  • Ukraine
  • United Kingdom
  • Vietnam
  • Indonesia
  • Brazil
  • Australia
  • Argentina
  • Nigeria

Finer-grained location fields are also included:

  • location e.g. Sydney, New South Wales
  • coordinates e.g. [-33.873, 151.205] (points to The Scary Canary on Kent Street so may or may not be accurate)
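
The coordinates example above is a bracketed pair. If the column is stored as text in that form (an assumption, so check the actual file), it could be split into numeric latitude and longitude roughly as follows. The helper name `parse_coordinates` is hypothetical, and the latitude-first order is inferred from the Sydney example.

```r
# Split "[lat, lon]" strings into numeric latitude and longitude columns.
# Assumes latitude comes first, as in the Sydney example.
parse_coordinates <- function(x) {
  cleaned <- gsub("\\[|\\]|\\s", "", x)          # drop brackets and whitespace
  parts <- strsplit(cleaned, ",", fixed = TRUE)  # one character vector per input
  data.frame(
    lat = vapply(parts, function(p) as.numeric(p[1]), numeric(1)),
    lon = vapply(parts, function(p) as.numeric(p[2]), numeric(1))
  )
}

coords <- parse_coordinates("[-33.873, 151.205]")
coords
```

With numeric columns in hand, a quick range check (latitude within -90 to 90, longitude within -180 to 180) can flag records where the pair is swapped or malformed.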

Diseases

Data for seven diseases:

  • Influenza (many strains)
  • Covid-19
  • Mpox
  • Legionnaires’
  • Dengue
  • Measles
  • Cholera

Syndromes

Syndromes refer to more generalised symptoms, usually recorded when the disease is unknown. Common syndromes include:

  • Acute gastroenteritis
  • Severe acute respiratory syndrome
  • Febrile syndromes
  • Pneumonia
  • Influenza-like illness

See the documentation for details of the transmission types related to each syndrome.

Example of syndromes preceding diseases


Early reports like this one refer to an “unexplained type of pneumonia”.

Unexplained pneumonia kills two health workers in Argentina

An unexplained type of pneumonia has killed two health workers in Tucumán, a province in northwestern Argentina, with four patients undergoing intensive care.

As the week progresses, the source of the infection becomes clearer and is identified as Legionnaires’ disease in articles like this one:

Legionnaires’ outbreak in Tucumán claims sixth life

Latest victim, who died late Sunday, was an 81-year-old patient with comorbidities who had been “in a serious condition” receiving treatment for pneumonia.
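
The pattern in this example, a syndrome report followed days later by a named disease, is something that can be looked for in the data. A rough sketch in base R, assuming each row carries a country, a report date, and either a syndrome or a disease label. The column names and the rows below are made up for illustration and are not taken from EPIWATCH.

```r
# Illustrative rows only: dates and column names are placeholders.
reports <- data.frame(
  country  = c("Argentina", "Argentina", "Argentina"),
  date     = as.Date(c("2000-01-01", "2000-01-03", "2000-01-05")),
  syndrome = c("Pneumonia", "Pneumonia", NA),
  disease  = c(NA, NA, "Legionnaires'"),
  stringsAsFactors = FALSE
)

# Earliest syndrome-only report and earliest named-disease report.
syndrome_only  <- !is.na(reports$syndrome) & is.na(reports$disease)
first_syndrome <- min(reports$date[syndrome_only])
first_disease  <- min(reports$date[!is.na(reports$disease)])

# Days between the first syndrome report and the first named-disease report.
lead_days <- as.numeric(difftime(first_disease, first_syndrome, units = "days"))
lead_days
```

On real data the same comparison would need to be done per country or location, and restricted to syndromes plausibly linked to the disease in question.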